different algorithm
- Asia > China > Guangdong Province > Shenzhen (0.04)
- North America > United States > Massachusetts > Plymouth County > Norwell (0.04)
- Asia > Middle East > Jordan (0.04)
- Europe > Switzerland > Basel-City > Basel (0.05)
- Europe > Switzerland > Zürich > Zürich (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- Information Technology > Modeling & Simulation (0.95)
- Information Technology > Data Science (0.93)
Causes and Effects of Unanticipated Numerical Deviations in Neural Network Inference Frameworks
Hardware-specific optimizations in machine learning (ML) frameworks can cause numerical deviations of inference results. Quite surprisingly, despite using a fixed trained model and fixed input data, inference results are not consistent across platforms, and sometimes not even deterministic on the same platform. We study the causes of these numerical deviations for convolutional neural networks (CNNs) on realistic end-to-end inference pipelines and in isolated experiments. Results from 75 distinct platforms suggest that the main causes of deviations are differences in SIMD use on CPUs and the runtime selection of convolution algorithms on GPUs. We link the causes and propagation effects to properties of the ML model and evaluate potential mitigations. We make our research code publicly available.
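One general mechanism behind deviations of this kind is that floating-point addition is not associative, so reordering a reduction can change the result. The following is a minimal sketch, not the authors' experiment: `lane_sum` is a hypothetical helper that only imitates how a SIMD unit might accumulate partial sums in separate lanes, and the input values are chosen purely for illustration.

```python
def sequential_sum(values):
    """Add the values strictly left to right, as scalar code would."""
    total = 0.0
    for v in values:
        total += v
    return total


def lane_sum(values, lanes=2):
    """Imitate a SIMD-style reduction (hypothetical helper).

    Element i is accumulated into partial sum i % lanes, and the
    partial sums are combined at the end. The set of operands is the
    same as in sequential_sum; only the order of additions differs.
    """
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    return sequential_sum(partial)


# Illustrative inputs: at magnitude 1e17 adjacent doubles are 16 apart,
# so adding 1.0 to 1e17 is rounded away.
values = [1e17, 1.0, -1e17, 1.0]

print(sequential_sum(values))     # 1.0: the first 1.0 is lost to rounding
print(lane_sum(values, lanes=2))  # 2.0: large terms cancel within one lane
```

Both functions are deterministic on their own; the difference arises only from the order of operations, which is the kind of choice a framework may make differently depending on the hardware it detects.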
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Austria > Tyrol > Innsbruck (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Training Uncertainty
The first subset (in red) is utilized to evaluate a traditional accuracy-based loss function ℓ_a, such as the cross entropy. This benchmark is based on a loss function designed to incentivize the trained model to produce the smallest possible conformal prediction sets with the desired coverage (e.g., 90% if α = 0.1). The hybrid training procedure is similar to Algorithm 1, in the sense that it relies on analogous soft-sorting, soft-ranking, and soft-indexing algorithms to evaluate a differentiable approximation of the conformity score W_i in (8). Above, the second equality follows directly from the fact that S(x, U; π, t), defined in (A2), is by construction increasing in t, and therefore Y ∉ S(x, U; π, 1 − α) if and only if min{t ∈ [0, 1] : Y ∈ S(x, U; π, t)} > 1 − α. The proof consists of showing that ℓ_a and ℓ_u are separately minimized by π̂ = π, although only approximately in the latter case.
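For readers unfamiliar with conformal prediction sets and the coverage level 1 − α, the sketch below shows plain split conformal prediction for classification. It is a deliberately simpler stand-in, not the paper's method: it uses the score 1 − p(y | x) instead of the randomized conformity scores W_i in (8), it has no differentiable relaxation, and the function names and toy probabilities are assumptions made for illustration.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Calibrate a score threshold on held-out data.

    cal_probs[i] is the model's class-probability vector for calibration
    point i, and cal_labels[i] is its true label. The score used here is
    1 - p(true label), a simplification assumed for this sketch.
    """
    scores = sorted(1.0 - probs[y] for probs, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: include every label.
        return float("inf")
    return scores[k - 1]


def prediction_set(probs, threshold):
    """Return all labels whose score does not exceed the threshold."""
    return [label for label, p in enumerate(probs) if 1.0 - p <= threshold]


# Toy calibration data (hypothetical values, two classes).
cal_probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
cal_labels = [0, 0, 1]

threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.25)
print(prediction_set([0.7, 0.3], threshold))
```

A sharper model yields smaller scores on the calibration set, hence a lower threshold and smaller prediction sets at the same coverage, which is the quantity the training loss described above is designed to reduce.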
- Asia > Middle East > Israel (0.04)
- North America > United States (0.04)
A Extension to k-Means and (k, p)-Clustering
The lower bound on opt(U) given in Lemma B.10 holds for ρ-metric spaces with no modifications. By making the appropriate modifications to the proof of Theorem C.1, we can extend this theorem to In particular, we can obtain a proof of Theorem A.5 by taking the proof of Theorem C.1 and adding extra ρ factors whenever the triangle inequality is applied. We first prove Lemma B.1, which shows that the sizes of the sets U By Lemma B.2, we get that Henceforth, we fix some positive ξ and sufficiently large α such that Lemma B.3 holds. By now applying Lemma B.4 it follows that The following lemma is proven in [25]. Lemma B.1, the third inequality follows from Lemma B.7, and the fourth inequality follows from the The second inequality follows from Lemma B.8, the third inequality from averaging and the choice Proof of Lemma 3.3: It follows that with probability at least 1 e Hence, by Theorem D.1, we must have that O(poly(k)) query time must have Ω(k) amortized update time.
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- North America > Canada (0.04)
- Africa > Ghana > Greater Accra > Accra (0.04)
Supplementary Material
We provide additional results for EGTA applied to networked MARL system control for CPR management. Restraint percentages under different regeneration rates The heatmaps in Figure 7 (A-C) highlight the differences in restraint percentage for different values of α as the regeneration rate is changed from high (0.1) to In the case where agents are completely self-interested (α = 0) shown in (A), the majority of algorithms without communication display very low levels of restraint for all rates of regeneration. The orange ovals in these diagrams indicate which system configurations correspond to the highest expected payoff for all agents. Schelling diagrams using a different parameterisation An alternative parameterisation for a Schelling diagram is to plot payoffs for a particular agent (cooperating or defecting) with respect to the number of other cooperators on the x-axis, instead of the total number of cooperators.